A modular tool to aggregate results from bioinformatics analyses across many samples into a single report.
Report
generated on 2025-04-17, 13:39 UTC
based on data in:
/home/runner/work/pmultiqc/pmultiqc/data
pmultiqc
pmultiqc is a MultiQC module that shows the performance of mass-spectrometry-based quantification pipelines such as nf-core/quantms and MaxQuant. URL: https://github.com/bigbio/pmultiqc
Parameters
The MaxQuant parameters, extracted from parameters.txt, summarize the settings used for the MaxQuant analysis. Key parameters are the MaxQuant version, Re-quantify, Match-between-runs and the mass search tolerances. A list of protein database files is also provided, which makes it possible to track database completeness and database version information (if given in the filename).
| No. | Parameter | Value |
|---|---|---|
| 1 | Version | 1.5.2.8 |
| 2 | User name | cbielow |
| 3 | Machine name | CD02-WIN7 |
| 4 | Date of writing | 08/05/2015 11:38:59 |
| 5 | Fixed modifications | Carbamidomethyl (C) |
| 6 | Decoy mode | revert |
| 7 | Special AAs | KR |
| 8 | Include contaminants | True |
| 9 | MS/MS tol. (FTMS) | 20 ppm |
| 10 | Top MS/MS peaks per 100 Da. (FTMS) | 12 |
| 11 | MS/MS deisotoping (FTMS) | True |
| 12 | MS/MS tol. (ITMS) | 0.5 Da |
| 13 | Top MS/MS peaks per 100 Da. (ITMS) | 8 |
| 14 | MS/MS deisotoping (ITMS) | False |
| 15 | MS/MS tol. (TOF) | 40 ppm |
| 16 | Top MS/MS peaks per 100 Da. (TOF) | 10 |
| 17 | MS/MS deisotoping (TOF) | True |
| 18 | MS/MS tol. (Unknown) | 0.5 Da |
| 19 | Top MS/MS peaks per 100 Da. (Unknown) | 8 |
| 20 | MS/MS deisotoping (Unknown) | False |
| 21 | PSM FDR | 0.0 |
| 22 | Protein FDR | 0.0 |
| 23 | Site FDR | 0.0 |
| 24 | Use Normalized Ratios For Occupancy | True |
| 25 | Min. peptide Length | 7 |
| 26 | Min. score for unmodified peptides | 0 |
| 27 | Min. score for modified peptides | 40 |
| 28 | Min. delta score for unmodified peptides | 0 |
| 29 | Min. delta score for modified peptides | 6 |
| 30 | Min. unique peptides | 0 |
| 31 | Min. razor peptides | 1 |
| 32 | Min. peptides | 1 |
| 33 | Use only unmodified peptides and | True |
| 34 | Modifications included in protein quantification | Acetyl (Protein N-term);Oxidation (M) |
| 35 | Peptides used for protein quantification | Razor |
| 36 | Discard unmodified counterpart peptides | True |
| 37 | Min. ratio count | 2 |
| 38 | Re-quantify | False |
| 39 | Use delta score | False |
| 40 | iBAQ | False |
| 41 | iBAQ log fit | False |
| 42 | Match between runs | True |
| 43 | Matching time window [min] | 0.7 |
| 44 | Alignment time window [min] | 20 |
| 45 | Find dependent peptides | False |
| 46 | Fasta file | crap_withMycoplasma.fasta;uniprot_human_canonical_and_isoforms_20130513.fasta |
| 47 | Labeled amino acid filtering | True |
| 48 | Site tables | Oxidation (M)Sites.txt |
| 49 | RT shift | False |
| 50 | Advanced ratios | True |
| 51 | First pass AIF correlation | 0.8 |
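As a rough illustration of how such a table can be read programmatically, the sketch below parses a small tab-separated excerpt with Python's standard library. It assumes that parameters.txt has a header row with the columns `Parameter` and `Value`; the `SAMPLE` string and the `read_parameters` helper are hypothetical and are not part of pmultiqc.

```python
import csv
import io

# Hypothetical excerpt of a parameters.txt file (assumed: tab-separated,
# header row "Parameter<TAB>Value"). Values are taken from the table above.
SAMPLE = (
    "Parameter\tValue\n"
    "Version\t1.5.2.8\n"
    "Match between runs\tTrue\n"
    "Fasta file\tcrap_withMycoplasma.fasta;"
    "uniprot_human_canonical_and_isoforms_20130513.fasta\n"
)


def read_parameters(handle):
    """Return a dict mapping each parameter name to its raw string value."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["Parameter"]: row["Value"] for row in reader}


params = read_parameters(io.StringIO(SAMPLE))

# Multiple database files are separated by ';' in the table above.
fasta_files = params["Fasta file"].split(";")
print(params["Version"], fasta_files)
```

Splitting the `Fasta file` entry is what allows the database names (and any version or date encoded in them) to be inspected one by one.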
Intensity Distribution
Intensity boxplots by experimental groups. Groups are user-defined during MaxQuant configuration. This plot displays a (customizable) threshold line for the desired mean intensity of proteins. Groups that underperform here are likely to also suffer from a worse MS/MS ID rate and higher contamination due to the lack of total protein loaded/detected. If possible, all groups should show a high and consistent amount of total protein.
The height of the bar correlates to the number of proteins with non-zero abundance.
LFQ Intensity Distribution
Label-free quantification (LFQ) intensity boxplots by experimental groups.
Groups are user-defined during MaxQuant configuration. This plot displays a (customizable) threshold line for the desired mean LFQ intensity of proteins. Raw files that underperform in raw intensity are likely to show an increased mean here, since only high-abundance proteins are recovered and quantifiable by MaxQuant in such a Raw file. The remaining proteins are likely to receive an LFQ value of 0 (i.e. they do not contribute to the distribution).
The height of the bar correlates to the number of proteins with non-zero abundance.
PCA of Raw Intensity
[Excludes Contaminants] Principal components plots of experimental groups (as defined during MaxQuant configuration).
This plot is shown only if more than one experimental group was defined. If LFQ was activated in MaxQuant, an additional PCA plot for LFQ intensities is shown; the same applies if iTRAQ/TMT reporter intensities are detected. Since experimental groups and Raw files do not necessarily correspond 1:1, this plot cannot use the abbreviated Raw file names and must instead rely on automatic shortening of group names.
PCA of LFQ Intensity
[Excludes Contaminants] Principal components plots of experimental groups (as defined during MaxQuant configuration).
This plot is shown only if more than one experimental group was defined. If LFQ was activated in MaxQuant, an additional PCA plot for LFQ intensities is shown; the same applies if iTRAQ/TMT reporter intensities are detected. Since experimental groups and Raw files do not necessarily correspond 1:1, this plot cannot use the abbreviated Raw file names and must instead rely on automatic shortening of group names.
MS/MS Identified per Raw File
MS/MS identification rate per Raw file from summary.txt.
Peptide Intensity Distribution
Peptide precursor intensity per Raw file from evidence.txt WITHOUT match-between-runs evidence.
Low peptide intensity usually goes hand in hand with low MS/MS identification rates and unfavourable signal/noise ratios, which makes signal detection harder. Instrument acquisition time also increases for trapping instruments. Failing to reach the intensity threshold is usually due to unfavourable column conditions, inadequate column loading or ionization issues. If the study is not a dilution series or pulsed SILAC experiment, we would expect every condition to have about the same median log-intensity. The relative standard deviation (RSD) gives an indication of reproducibility across files and should be below 5%.
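The RSD check described here can be sketched as follows. This is a minimal illustration, not the code used to build the report: it assumes log2-transformed intensities, the sample standard deviation, and made-up intensity values for three hypothetical Raw files.

```python
import math
import statistics


def median_log2_intensity(intensities):
    """Median of log2 intensities, ignoring zero or missing (<= 0) values."""
    return statistics.median(math.log2(x) for x in intensities if x > 0)


def relative_sd_percent(values):
    """Relative standard deviation in percent (sample SD divided by mean)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical peptide intensities per Raw file (illustrative numbers only).
per_file = {
    "file_A": [1024.0, 4096.0, 16384.0],
    "file_B": [2048.0, 4096.0, 8192.0],
    "file_C": [512.0, 8192.0, 0.0],
}

medians = {name: median_log2_intensity(v) for name, v in per_file.items()}
rsd = relative_sd_percent(list(medians.values()))
print(medians, round(rsd, 2), "reproducible" if rsd < 5.0 else "check files")
```

Taking the median per file before computing the RSD keeps a handful of extreme peptides from dominating the comparison.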
Potential Contaminants per Group
Potential contaminants per group from proteinGroups.txt.
External protein contamination should be controlled for. MaxQuant therefore ships with a comprehensive yet customizable protein contamination database, which it searches by default. This contamination plot is derived from the proteinGroups (PG) table and shows the fraction of total protein intensity attributable to contaminants.
Note that this plot is based on experimental groups, and therefore may not correspond 1:1 to Raw files.
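The fraction shown in this plot can be illustrated with a small sketch. The input layout is an assumption made for the example: each experimental group is represented as a list of `(intensity, is_contaminant)` pairs, one per protein group, rather than as the actual proteinGroups.txt columns. The group names and numbers are hypothetical.

```python
def contaminant_fraction(rows):
    """Fraction of summed intensity that belongs to contaminant entries.

    rows: iterable of (intensity, is_contaminant) pairs for one group.
    Returns 0.0 when the group has no intensity at all.
    """
    rows = list(rows)
    total = sum(intensity for intensity, _ in rows)
    if total == 0:
        return 0.0
    contaminant = sum(intensity for intensity, flag in rows if flag)
    return contaminant / total


# Hypothetical protein-group intensities for two experimental groups.
groups = {
    "group_1": [(100.0, False), (50.0, True), (50.0, False)],
    "group_2": [],
}

fractions = {name: contaminant_fraction(rows) for name, rows in groups.items()}
print(fractions)
```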
Top5 Contaminants per Raw file
The five most abundant external protein contaminants by Raw file
pmultiqc explicitly shows the five most abundant external protein contaminants (as detected via MaxQuant's contaminants FASTA file) by Raw file and summarizes the remaining contaminants as 'other'. This makes it possible to track down exactly which proteins contaminate your sample. Lower contamination is better.
If you see fewer than five contaminants, either there really are fewer, or one (or more) of the shortened contaminant names subsumes several of the top five contaminants (since they start with the same prefix).
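The following sketch shows one way the "top five plus 'other'" summary, and the name-shortening effect, could arise. It is an illustration under assumptions: the `top_contaminants` helper, the three-character shortening rule and the protein names and intensities are all hypothetical and not taken from pmultiqc.

```python
from collections import defaultdict


def top_contaminants(intensity_by_name, n=5, shorten=lambda name: name):
    """Keep the n most intense contaminants and pool the rest as 'other'.

    Names are shortened only after the top n were selected, so several
    top entries may collapse into a single displayed name.
    """
    ranked = sorted(intensity_by_name.items(), key=lambda kv: kv[1], reverse=True)
    summary = defaultdict(float)
    for name, value in ranked[:n]:
        summary[shorten(name)] += value
    other = sum(value for _, value in ranked[n:])
    if other > 0:
        summary["other"] += other
    return dict(summary)


# Hypothetical contaminant intensities for one Raw file.
raw_file = {"KRT1": 50, "KRT2": 40, "KRT10": 30, "ALBU": 20,
            "TRYP": 10, "CASK": 5, "LYSC": 1}

full_names = top_contaminants(raw_file)
short_names = top_contaminants(raw_file, shorten=lambda name: name[:3])
print(full_names)
print(short_names)
```

With full names, five contaminants plus 'other' are listed. With the shortened names, the three keratin entries share the prefix `KRT`, so only three contaminants plus 'other' remain visible.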
Charge-state per Raw file
The distribution of the charge-state of the precursor ion, excluding potential contaminants.
Modifications per Raw file
Compute an occurrence table of modifications (e.g. Oxidation (M)) for all peptides, including the unmodified.
Post-translational modifications contained within the identified peptide sequence.
Peptide ID Count
[Excludes Contaminants] Number of unique (i.e. not counted twice) peptide sequences including modifications (after FDR) per Raw file.
If MBR was enabled, three categories ('Genuine (Exclusive)', 'Genuine + Transferred', 'Transferred (Exclusive)') are shown, so the user can judge the gain that MBR provides. Peptides in the 'Genuine + Transferred' category were identified within the Raw file by MS/MS, but at the same time also transferred to this Raw file using MBR. This ID transfer can be correct (e.g. in case of different charge states) or incorrect; see the MBR-related metrics to tell the difference. Ideally, the 'Genuine + Transferred' category should be rather small and the other two should be large.
If MBR were switched off, you could expect to see the number of peptides corresponding to 'Genuine (Exclusive)' + 'Genuine + Transferred'. In general, if the MBR gain is low and the MBR scores are bad (see the two MBR-related metrics), MBR should be switched off for the Raw files that are affected (this could be a few or all).
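The three categories are plain set operations, which the sketch below makes explicit. It assumes that, for one Raw file, the peptides identified by MS/MS and the peptides transferred by MBR are available as two sets; the peptide sequences and the `mbr_categories` helper are hypothetical.

```python
def mbr_categories(msms_ids, transferred_ids):
    """Count peptides per MBR category for a single Raw file."""
    msms_ids, transferred_ids = set(msms_ids), set(transferred_ids)
    return {
        "Genuine (Exclusive)": len(msms_ids - transferred_ids),
        "Genuine + Transferred": len(msms_ids & transferred_ids),
        "Transferred (Exclusive)": len(transferred_ids - msms_ids),
    }


# Hypothetical peptide sequences for one Raw file.
identified_by_msms = {"PEPTIDEK", "SAMPLER", "ANOTHERK"}
transferred_by_mbr = {"ANOTHERK", "MATCHEDR"}

counts = mbr_categories(identified_by_msms, transferred_by_mbr)

# Expected count with MBR switched off: everything identified by MS/MS.
without_mbr = counts["Genuine (Exclusive)"] + counts["Genuine + Transferred"]
print(counts, without_mbr)
```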
ProteinGroups Count
[Excludes Contaminants] Number of Protein groups (after FDR) per Raw file.
If MBR was enabled, three categories ('Genuine (Exclusive)', 'Genuine + Transferred', 'Transferred (Exclusive)') are shown, so the user can judge the gain that MBR provides. Here, 'Transferred (Exclusive)' means that this protein group has peptide evidence which originates only from transferred peptide IDs. The quantification is (of course) always from the local Raw file. Proteins in the 'Genuine + Transferred' category have peptide evidence from within the Raw file by MS/MS, but peptide IDs transferred to this Raw file using MBR were used as well. It is not unusual for the 'Genuine + Transferred' category to be rather large, since a protein group usually has peptide evidence from both sources. To see if MBR worked, it is better to look at the two MBR-related metrics.
If MBR were switched off, you could expect to see the number of protein groups corresponding to 'Genuine (Exclusive)' + 'Genuine + Transferred'. In general, if the MBR gain is low and the MBR scores are bad (see the two MBR-related metrics), MBR should be switched off for the Raw files that are affected (this could be a few or all).
Oversampling
An oversampled 3D-peak is defined as a peak whose peptide ion (same sequence and same charge state) was identified by at least two distinct MS2 spectra in the same Raw file.
For high-complexity samples, oversampling of individual 3D-peaks automatically leads to undersampling or even omission of other 3D-peaks, reducing the number of identified peptides. Oversampling occurs in low-complexity samples or long LC gradients, as well as with undersized dynamic exclusion windows in data-dependent acquisition.
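The definition above can be turned into a small counting sketch. It assumes that each identified MS2 spectrum is available as a `(raw_file, sequence, charge)` record; this record layout, the `oversampled_fraction` helper and the example data are hypothetical and simplified.

```python
from collections import Counter, defaultdict


def oversampled_fraction(identified_spectra):
    """Per Raw file, the fraction of 3D-peaks hit by two or more MS2 spectra.

    identified_spectra: iterable of (raw_file, sequence, charge) records,
    one per identified MS2 spectrum.
    """
    spectra_per_peak = Counter(identified_spectra)
    peaks = defaultdict(int)
    oversampled = defaultdict(int)
    for (raw_file, _sequence, _charge), n_spectra in spectra_per_peak.items():
        peaks[raw_file] += 1
        if n_spectra >= 2:
            oversampled[raw_file] += 1
    return {raw_file: oversampled[raw_file] / peaks[raw_file] for raw_file in peaks}


# Hypothetical identifications: PEPTIDEK (2+) was fragmented twice in file_1.
spectra = [
    ("file_1", "PEPTIDEK", 2),
    ("file_1", "PEPTIDEK", 2),
    ("file_1", "PEPTIDEK", 3),
    ("file_1", "ANOTHERK", 2),
    ("file_2", "SAMPLER", 2),
]

result = oversampled_fraction(spectra)
print(result)
```

Note that the same sequence at a different charge state counts as a separate 3D-peak, in line with the definition.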
Missed Cleavages per Raw file
[Excludes Contaminants] Missed Cleavages per Raw file.
Under optimal digestion conditions (high enzyme grade etc.), only a few missed cleavages (MC) are expected. In general, increased MC counts also increase the number of peptide signals, thus cluttering the available space and potentially provoking overlapping peptide signals, which biases peptide quantification. Thus, low MC counts should be favored. Interestingly, it has been shown that incorporating peptides with missed cleavages does not negatively influence protein quantification (see Chiva, C., Ortega, M., and Sabido, E. Influence of the Digestion Technique, Protease, and Missed Cleavage Peptides in Protein Quantitation. J. Proteome Res. 2014, 13, 3979-86). However, this is true only if all samples show the same degree of digestion. High missed cleavage values can indicate, for example, a) failed digestion, b) high (post-digestion) protein contamination, or c) a sample with high amounts of nonspecifically degraded peptides that are not digested by trypsin.
If MC>=1 is high (>20%) you should increase the missed cleavages settings in MaxQuant and compare the number of peptides. Usually high MC correlates with bad identification rates, since many spectra cannot be matched to the forward database.
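The 20% rule of thumb can be checked with a few lines of code. The sketch below counts missed cleavages directly from the peptide sequence under a simplifying assumption: every internal K or R counts as a missed tryptic site, and proline is not treated specially. The peptide list is hypothetical.

```python
def missed_cleavages(sequence, cleavage_residues="KR"):
    """Count internal cleavage residues, i.e. all but the C-terminal one."""
    return sum(1 for residue in sequence[:-1] if residue in cleavage_residues)


def fraction_with_missed_cleavage(peptides):
    """Fraction of peptides with at least one missed cleavage (MC >= 1)."""
    peptides = list(peptides)
    if not peptides:
        return 0.0
    return sum(1 for p in peptides if missed_cleavages(p) >= 1) / len(peptides)


# Hypothetical peptide identifications of one Raw file.
peptides = ["PEPTIDEK", "AKPEPTIDER", "SAMPLER", "TESTKK", "LONGPEPTIDEK"]

mc_fraction = fraction_with_missed_cleavage(peptides)
print(f"MC>=1: {mc_fraction:.0%}", "(high)" if mc_fraction > 0.20 else "(ok)")
```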
In the rare case that 'no enzyme' was specified in MaxQuant, neither scores nor plots are shown.
IDs over RT
Distribution of retention time, derived from the evidence table.
The uncalibrated retention time (in minutes) in the elution profile of the precursor ion. Potential contaminants are excluded.
Peak width over RT
Distribution of widths of peptide elution peaks, derived from the evidence table.
The distribution of the widths of peptide elution peaks, derived from the evidence table and excluding potential contaminants, is one parameter of optimal and reproducible chromatographic separation.
Uncalibrated Mass Error
[Excludes Contaminants] Mass accuracy before calibration.
Mass error of the uncalibrated mass-over-charge value of the precursor ion in comparison to the predicted monoisotopic mass of the identified peptide sequence.
Calibrated Mass Error
[Excludes Contaminants] Mass accuracy after calibration.
Mass error of the recalibrated mass-over-charge value of the precursor ion in comparison to the predicted monoisotopic mass of the identified peptide sequence in parts per million.
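For reference, a mass error in parts per million is the deviation of the observed value from the theoretical value, divided by the theoretical value and multiplied by one million. The sketch below applies this standard formula; the m/z values and the tolerance are hypothetical examples, not values from this report.

```python
def mass_error_ppm(observed_mz, theoretical_mz):
    """Relative deviation of the observed m/z from the theoretical m/z in ppm."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tolerance_ppm):
    """True if the absolute mass error does not exceed the given tolerance."""
    return abs(mass_error_ppm(observed_mz, theoretical_mz)) <= tolerance_ppm


# Hypothetical precursor: theoretical m/z 500.000, measured at 500.001.
error = mass_error_ppm(500.001, 500.000)
print(round(error, 3), within_tolerance(500.001, 500.000, tolerance_ppm=5.0))
```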
TopN
This metric somewhat summarizes "TopN over RT"
Reaching TopN on a regular basis indicates that all sections of the LC gradient deliver a sufficient number of peptides to keep the instrument busy.
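One simple way to quantify "reaching TopN on a regular basis" is the share of acquisition cycles that triggered the full number of MS/MS scans. The sketch below assumes the number of MS/MS scans per cycle is already known; the input list, the `fraction_reaching_topn` helper and the Top10 setting are hypothetical.

```python
def fraction_reaching_topn(msms_scans_per_cycle, top_n):
    """Share of acquisition cycles in which at least top_n MS/MS scans were taken."""
    cycles = list(msms_scans_per_cycle)
    if not cycles:
        return 0.0
    return sum(1 for n_scans in cycles if n_scans >= top_n) / len(cycles)


# Hypothetical MS/MS scan counts for five consecutive cycles (Top10 method).
scans_per_cycle = [10, 10, 7, 10, 3]

reached = fraction_reaching_topn(scans_per_cycle, top_n=10)
print(f"{reached:.0%} of cycles reached Top10")
```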
TopN over RT
TopN over retention time.
Similar to IDs over RT, this metric reflects the complexity of the sample at any point in time. Ideally, complexity should be made roughly equal (constant) by choosing a proper (non-linear) LC gradient. See Moruz 2014, DOI: 10.1002/pmic.201400036 for details.
Ion Injection Time over RT
Ion injection time score - should be as low as possible to allow fast cycles. Correlated with peptide intensity. Note that this threshold needs customization depending on the instrument used (e.g., ITMS vs. FTMS).